ABSTRACT

A method of classification of digitized multispectral image
data is described. It is designed to exploit a particular
type of dependence between adjacent states of nature that is
characteristic of the data. The advantages of this, as
opposed to the conventional "per point" approach, are
greater accuracy and efficiency, and the results are in a
more desirable form for most purposes. Experimental results
from both aircraft and satellite data are included.


I. INTRODUCTION 

An important subject before the engineering and
scientific community at the present time is the processing
of scenes which represent tracts of the earth's surface as
viewed from above. A typical scene may consist primarily of
regular and/or irregular regions arranged in a patchwork
manner, each containing one "class" of surface cover type.
These homogeneous regions are the "objects" in the scene. A
basic processing goal is to locate the objects, identify
(classify) them, and produce tabulated results and/or a
"type-map" of the scene. As in other image processing
applications, the locations and spatial features (size,
shape, orientation) of objects are revealed by changes in
average spectral properties that occur at boundaries. But
unlike most other applications, these spatial features often
enable only a rough categorization of the object. Therefore
classification is more often based on its spectral features
using statistical pattern recognition techniques, a task for
which the digital computer is well adapted.

* This work was supported by NASA through Grant NGL 15-005-112
and Contract NAS 9-14016.

Computer classification of multispectral scanner (MSS)
data collected over a region is typically done by applying a
"simple symmetric" decision rule to each resolution element
(pixel). This means that each pixel is classified
individually on the basis of its spectral measurements
alone. A basic premise of this technique is that the
objects of interest are large compared to the size of a
pixel. Otherwise a large proportion of pixels would be
composites of two or more classes, making statistical
pattern classification unreliable; i.e. the prespecified
categories would be inadequate to describe the actual states
of nature. Since the sampling interval is usually
comparable to the pixel size (to preserve system
resolution), it follows that each object is represented by
an array of pixels. This suggests a statistical dependence
between consecutive states of nature, which the simple
symmetric classifier fails to exploit. To reflect this
property, we shall refer to simple symmetric classification
as "no-memory" classification.

One method for dealing with dependent states is to
apply the principles of compound decision theory or
sequential compound decision theory. Abend [1] points out
that a sequential procedure can be implemented fairly
efficiently when the states form a low-order Markov chain.
However, the prospect is considerably less attractive when
they form a Markov mesh, which is a more suitable model for
two-dimensional scenes. Furthermore, estimation of the
state transition probabilities could be another significant
obstacle to implementation of such a procedure.

The compound decision formulation is a powerful
approach for handling very general types of dependence.
This suggests that perhaps by tailoring an approach more
directly to the problem at hand, one can obtain similar
results with considerable simplification. A distinctive
characteristic of the spatial dependence in MSS data is
"redundance"; i.e. the probability of transition from state
i to state j is much greater if j = i than if j ≠ i, because
the sampling interval is generally smaller than the size of
an object. This suggests the use of an "image partitioning"
transformation to delineate the arrays of statistically
similar pixels before classifying them. Since each
homogeneous array represents a statistical "sample" (a set
of observations from a common population), a "sample
classifier" could then be used to classify the objects. In
this way, the classification of each pixel in the sample is
a result of the spectral properties of its neighbors as well
as its own. Thus its "context" in the scene is used to
provide better classification. The acronym ECHO (extraction
and classification of homogeneous objects) designates this
general approach.

A characteristic of both no-memory and compound
decision techniques is that the number of classifications
which must be performed is much larger than the actual
number of objects in the scene. When each classification
requires a large amount of computation, even the no-memory
classifier can be relatively slow. An ECHO technique would
substantially reduce the number of classifications,
resulting in a potential increase in speed (decrease in
cost).


The recent literature contains numerous references to
image partitioning algorithms. Robertson [2] divides them
into two main categories. "Boundary seeking" algorithms
characteristically attempt to exploit object contrast. Two
of these have been implemented with MSS data [3], but they
are incompatible with sample classifiers due mainly to their
failure to produce boundaries that always close on
themselves. The other category can be called "object
seeking" algorithms, which characteristically exploit the
internal regularity (homogeneity) of the objects. As the
name implies, an object seeking algorithm always produces
well-defined samples (and thus closed boundaries as well).
There are two opposite approaches to object seeking, which
we shall call conjunctive and disjunctive. A conjunctive
algorithm begins with a very fine partition and simplifies
it by progressively merging adjacent elements that
are found to be similar according to certain statistical
criteria [4,5]. A disjunctive algorithm begins with a very
simple partition and subdivides it until each element
satisfies a criterion of homogeneity. For example,
Robertson's algorithm [2,6] is based on the premise that if
a region contains a boundary, splitting the region
arbitrarily will usually produce two subregions with
significantly different statistical characteristics.

We combined Rodd's [5] conjunctive partitioning
algorithm with a minimum distance sample classifier and
observed an improvement in classification accuracy over
conventional no-memory classification, but processing time
was increased [7]. Gupta and Wintz [3] added a test of
second order statistics to Rodd's first order test, but
obtained essentially the same results as the first order
test at greater cost in processing time. Robertson [2,6]
implemented a disjunctive partitioning algorithm with the
same minimum distance classifier. He obtained about the
same classification accuracy as conventional no-memory
classification with an order of magnitude increase in
processing time.

The current investigation is devoted to further
development and testing of the conjunctive approach. Major
changes in both the classification and partitioning
strategies have resulted in significant improvements in
accuracy, stability, and speed.


II. SAMPLE CLASSIFICATION 

A typical scene is assumed to consist primarily of
objects whose boundaries form a partition of the scene.
Each object in the partition belongs to one of K classes.
Let $W_i$ denote the event that an object belongs to class i.
As previously indicated, we ignore any statistical
dependence of this event on the size, shape, and location of
the object. We rely instead on its spectral features. Each
pixel in an object is a q-dimensional random variable, where
q denotes the number of spectral measurements per pixel. It
is commonly assumed that the q-variate, marginal
probability density function (pdf) of a pixel, X, depends
only on the class of the object containing X. This is due
to the homogeneity of the types of objects typically
encountered in remote sensing applications. $p(x \mid W_i)$
denotes this class-conditional density function for the ith
class. Another common assumption is that the classes can be
defined such that $p(x \mid W_i)$ is approximately multivariate
normal (MVN); i.e.

$$p(x \mid W_i) = |2\pi C_i|^{-1/2} \exp\left(-\tfrac{1}{2}(x - M_i)^T C_i^{-1} (x - M_i)\right)$$

for some q-dimensional, positive-definite covariance matrix
$C_i$ and some mean vector $M_i$. Parametric estimates of
these density functions are obtained by estimating $M_i$ and
$C_i$ from sets (samples) of training data supplied for each
class.
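The training step and the MVN density above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: NumPy is assumed to be available, the function names are hypothetical, and the covariance is estimated by dividing by n (the paper does not say which normalization was used).

```python
import numpy as np

def estimate_class_parameters(training_pixels):
    """Return (M_i, C_i) estimated from an (n, q) array of training pixels."""
    x = np.asarray(training_pixels, dtype=float)
    mean = x.mean(axis=0)
    centered = x - mean
    # Assumption: maximum likelihood normalization (divide by n, not n - 1).
    cov = centered.T @ centered / x.shape[0]
    return mean, cov

def log_density(pixel, mean, cov):
    """ln p(x | W_i) for the multivariate normal density with parameters (mean, cov)."""
    d = np.asarray(pixel, dtype=float) - mean
    _, logdet = np.linalg.slogdet(2.0 * np.pi * cov)   # ln |2 pi C_i|
    return -0.5 * (logdet + d @ np.linalg.solve(cov, d))
```

Working with the logarithm of the density avoids underflow when many pixels are combined, which is how the likelihood is used in the sample classifier discussed later in this section.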


Two pixels in spatial proximity to one another are
unconditionally correlated, with the degree of correlation
decreasing as the distance between them increases. Much of
this correlation is attributable to the effect of dependent
states mentioned in the previous section, which is the
effect we wish to exploit. For simplicity we shall ignore
other sources of correlation. Thus we assume
class-conditional independence (as does the compound
decision approach).

If $X = (X_1, \ldots, X_n)$ represents a set of pixels in
some object, then this set constitutes a "sample" from a
population characterized by one of the class-conditional
pdf's. A sample classifier is simply a strategy for
deciding which one, based on the n observations. One
popular approach is the "minimum distance (MD) strategy"
[9]. In MD classification, the n data vectors are used to
estimate the pdf of the population, and the class is chosen
whose pdf is closest to this estimate as measured by some
appropriately defined "distance measure" on the set of
density functions. A popular distance measure is the
Bhattacharyya distance, which for $N(M_i, C_i)$ and $N(M, C)$ is
given by:

$$B = \tfrac{1}{2} \ln \frac{|(C_i + C)/2|}{|C_i|^{1/2}\,|C|^{1/2}} + \tfrac{1}{8}\,\mathrm{tr}\left(\left[\tfrac{1}{2}(C_i + C)\right]^{-1} (M_i - M)(M_i - M)^T\right) \qquad (1)$$

A drawback of the MD approach is that it fails for small n,
because the density estimate becomes degenerate.
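As a rough illustration of the Bhattacharyya distance between two MVN densities, the following Python sketch evaluates the standard closed form. NumPy is assumed, and the function name and argument order are hypothetical rather than taken from the paper.

```python
import numpy as np

def bhattacharyya_distance(m1, c1, m2, c2):
    """Bhattacharyya distance between N(m1, c1) and N(m2, c2)."""
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    c1, c2 = np.asarray(c1, dtype=float), np.asarray(c2, dtype=float)
    c_avg = 0.5 * (c1 + c2)
    d = m1 - m2
    _, logdet_avg = np.linalg.slogdet(c_avg)
    _, logdet_1 = np.linalg.slogdet(c1)
    _, logdet_2 = np.linalg.slogdet(c2)
    # Covariance term: (1/2) ln( |c_avg| / sqrt(|c1| |c2|) )
    cov_term = 0.5 * (logdet_avg - 0.5 * (logdet_1 + logdet_2))
    # Mean term: (1/8) d^T c_avg^{-1} d
    mean_term = 0.125 * d @ np.linalg.solve(c_avg, d)
    return cov_term + mean_term
```

When the covariance estimated from a sample is singular (for instance when n does not exceed q), its log-determinant is not finite and the distance cannot be evaluated, which is the small-n failure noted above.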

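Our preference is the maximum likelihood (ML) strategy,
which assigns X to class i if

$$\ln p(X \mid W_i) = \max_j \ln p(X \mid W_j)$$

Due to the assumption of class-conditional independence,
these quantities can be computed as:

$$\ln p(X \mid W_i) = -\tfrac{1}{2}\,\mathrm{tr}(C_i^{-1} S_2) + M_i^T C_i^{-1} S_1 - \tfrac{n}{2}\left(M_i^T C_i^{-1} M_i + \ln|2\pi C_i|\right) \qquad (2)$$

where

$$S_1 = \sum_{k=1}^{n} X_k, \qquad S_2 = \sum_{k=1}^{n} X_k X_k^T$$

Of course, $M = S_1/n$ and $C = S_2/n - M M^T$.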

Formula (2) is much faster to compute than formula (1) for
each $\{S_1, S_2\}$ pair, once the non-data-dependent constants
have been initialized. Thus the ML strategy is
computationally efficient. Another important property is
that it does not fail for small n. On theoretical grounds,
for the idealized conditions we have stated, it is the
optimum strategy (for minimum error rate) when the a priori
class probabilities are equal. Also, the Chernoff bound for
ML no-memory classification (n = 1) can be extended to provide
an error bound for ML sample classification that is a sum of
exponentially decreasing functions of the sample size.
Experimentally the two strategies appear about equal in
terms of accuracy, with the ML strategy possibly having a
slight advantage.
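The ML sample classifier of formula (2) can be sketched as follows. This is a minimal illustration, assuming NumPy and a hypothetical `class_params` list of (mean, covariance) pairs; the class-dependent inverses and determinants are recomputed on each call here, whereas a practical version would precompute them once, as the text suggests.

```python
import numpy as np

def sample_log_likelihood(s1, s2, n, mean, cov):
    """Formula (2): ln p(X | W_i) from S1 = sum of x_k and S2 = sum of x_k x_k^T."""
    cinv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(2.0 * np.pi * cov)   # ln |2 pi C_i|
    return (-0.5 * np.trace(cinv @ s2)
            + mean @ cinv @ s1
            - 0.5 * n * (mean @ cinv @ mean + logdet))

def classify_sample(pixels, class_params):
    """Return (index of the most likely class, list of log likelihoods)."""
    x = np.asarray(pixels, dtype=float)        # shape (n, q)
    s1 = x.sum(axis=0)                         # S1
    s2 = x.T @ x                               # S2
    scores = [sample_log_likelihood(s1, s2, len(x), m, c)
              for m, c in class_params]
    return int(np.argmax(scores)), scores
```

Because only S1 and S2 depend on the data, the same function works unchanged for a single pixel (n = 1), which reflects the property that the ML strategy does not fail for small samples.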

As a matter of theoretical interest, it can be shown
that use of the ML strategy gives the same results (with
less computation) as an MD strategy using one of the
Kullback-Leibler numbers, if |C| > 0. (If |C| = 0, the K-L
number is undefined, but the ML strategy is still valid.)


III. IMAGE PARTITIONING

The basic approach that we have adopted (due to Rodd
[5]) consists of two "levels" of tests. Initially the
pixels are divided, by a rectangular grid, into small groups
of four (for example). At the first level of testing, each
group becomes a unit called a "cell", provided that it
satisfies a relatively mild criterion of homogeneity. Those
groups that are rejected are assumed to overlap a boundary
and their individual pixels are classified by the no-memory
method. These groups are referred to as "singular" cells.
At this level it is usually desirable to maintain a fairly
low rejection rate to reflect the relatively high a priori
probability of a group being homogeneous. The goal at this
level is essentially the same as the goal of the boundary
seeking techniques mentioned previously; i.e. to detect as
many pixels as possible that lie along boundaries without
requiring that the ones detected form closed contours or
even be connected.

At the second level, an individual cell is compared to
an adjacent "field", which is simply a group of one or more
connected cells that have previously been merged. If the
two samples appear statistically similar by some appropriate
criterion, then they too are merged. Otherwise the cell is
compared to another adjacent field or becomes a new field
itself. By successively "annexing" adjacent cells, each
field expands until it reaches its natural boundaries, where
the rejection rate abruptly increases, thereby halting
further expansion. The field is then classified by a sample
classifier, and the classification is assigned to all its
pixels.

This approach has the important advantage that it can
be implemented "sequentially"; i.e. raw data need be
accessed only once and in the same order that it is stored
on tape. This is important for practical, rather than
theoretical, considerations. The flow chart in Figure 1
indicates how it can be done. In this chart, the top of the
scene is referred to as north, and the general processing
sequence is from north to south.

Many modifications to the basic flow chart are, of
course, possible. One of the modifications we use involves
comparing a cell to as many as three different fields at
once (seeking the best "match"), instead of one at a time.
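The two-level, sequential idea can be illustrated with a deliberately simplified Python sketch. It is not a transcription of Figure 1: it assumes the scene has already been cut into a grid of cells, it tries only the fields of the west and north neighbours (one at a time, not the best of three), and the two tests are passed in as hypothetical callables `is_homogeneous` and `similar`.

```python
def partition(cells, is_homogeneous, similar):
    """Sequential conjunctive partitioning sketch.

    cells: 2-D list (rows ordered north to south) of cells, each a list of pixels.
    Returns (labels, fields): labels[r][c] is a field id, or None for a
    singular cell; fields maps each field id to its pooled pixels.
    """
    labels = []
    fields = {}
    next_label = 0
    for r, row in enumerate(cells):
        labels.append([])
        for c, cell in enumerate(row):
            # Level 1: cell selection. Rejected cells are "singular".
            if not is_homogeneous(cell):
                labels[r].append(None)
                continue
            # Level 2: candidate fields come from the west and north neighbours.
            candidates = []
            if c > 0 and labels[r][c - 1] is not None:
                candidates.append(labels[r][c - 1])
            if r > 0 and labels[r - 1][c] is not None:
                candidates.append(labels[r - 1][c])
            for lab in candidates:
                if similar(fields[lab], cell):
                    fields[lab].extend(cell)      # annex the cell
                    labels[r].append(lab)
                    break
            else:
                # No adjacent field accepted the cell: it starts a new field.
                fields[next_label] = list(cell)
                labels[r].append(next_label)
                next_label += 1
    return labels, fields
```

Each cell is visited exactly once in storage order, which is the property the text calls sequential. In a fuller version each completed field would be passed to the sample classifier, and the pixels of singular cells to the no-memory classifier.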

Annexation Criterion

Let $X = (X_1, \ldots, X_n)$ represent the pixels in a group of
one or more cells which have been merged by successive
annexations. Let $Y = (Y_1, \ldots, Y_m)$ represent the pixels in an
adjacent, non-singular cell. Since both X and Y have
satisfied certain criteria of homogeneity, we assume that
each is a sample from a MVN population. Let f and g
represent the corresponding density functions. It is
desired to test the (null) hypothesis that f = g. This is a
composite hypothesis, since it does not specify f and g.
The "likelihood ratio procedure" [10] provides an effective
statistic for testing this hypothesis. Van Trees [11]
refers to it as the "generalized likelihood ratio". Let

$$H_0(x,y) = \{\, p(x,y \mid f,g) : g = f,\ f \in \Omega \,\}$$

$$H_1(x,y) = \{\, p(x,y \mid f,g) : f \in \Omega,\ g \in \Omega \,\}$$

where $p(x,y \mid f,g)$ is the conditional joint density of X and Y
evaluated at $x \in R^{nq}$ and $y \in R^{mq}$, and $\Omega$ is a set of MVN
density functions. The assumption of class-conditional
independence enables us to express the joint density of
pixels as the product of their marginal densities. Thus:

$$p(x,y \mid f,g) = p(x \mid f)\, p(y \mid g) = \left(\prod_{i=1}^{n} f(x_i)\right)\left(\prod_{i=1}^{m} g(y_i)\right)$$

The generalized likelihood ratio is given by:

$$\Lambda = \frac{\sup H_0(X,Y)}{\sup H_1(X,Y)} = \frac{\max_{f \in \Omega}\, p(X \mid f)\, p(Y \mid f)}{\max_{f \in \Omega} p(X \mid f)\ \max_{g \in \Omega} p(Y \mid g)}$$

For an "unsupervised" approach to partitioning we take $\Omega$ to
be the following set of functions of x:

$$\Omega = \{\, N(x; M, C) : M \in R^{q},\ C\ \text{symmetric and positive-definite} \,\}$$
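Anderson [12] shows that:

$$\Lambda = \Lambda_1 \Lambda_2 \qquad (3)$$

where

$$\Lambda_2 = \left(|A| / |B|\right)^{N/2} \qquad (4)$$

$$\Lambda_1 = \left(\frac{|A_x/n|^{n}\, |A_y/m|^{m}}{|A/N|^{N}}\right)^{1/2} \qquad (5)$$

$$N = n + m$$

$$\bar{X} = \sum_{i=1}^{n} X_i / n, \qquad \bar{Y} = \sum_{i=1}^{m} Y_i / m$$

$$A_x = \sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})^T, \qquad A_y = \sum_{i=1}^{m} (Y_i - \bar{Y})(Y_i - \bar{Y})^T$$

(In order to assure non-singular matrices with probability
one, we need n > q, m > q. [12])

$$A = A_x + A_y$$

$$M = (n\bar{X} + m\bar{Y}) / N$$

$$B_x = \sum_{i=1}^{n} (X_i - M)(X_i - M)^T = A_x + n(\bar{X} - M)(\bar{X} - M)^T$$

$$B_y = \sum_{i=1}^{m} (Y_i - M)(Y_i - M)^T = A_y + m(\bar{Y} - M)(\bar{Y} - M)^T$$

$$B = B_x + B_y = A + \frac{mn}{N}(\bar{X} - \bar{Y})(\bar{X} - \bar{Y})^T$$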

Anderson also suggests modifying $\Lambda$ by replacing the number
of pixels in each sample by the number of degrees of
freedom; i.e. replace n by n-1, m by m-1, and N by N-2 in
formulas (4) and (5). In either case, the statistics are
invariant with respect to a linear transformation on the
data vectors. It follows that their distributions under the
null hypothesis are independent of the actual MVN population
from which the samples are drawn.
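A possible way to compute the two statistics is sketched below in Python. It is only an illustration: NumPy is assumed, the function names are hypothetical, the unmodified forms of formulas (4) and (5) are used, and logarithms are returned because the raw ratios can underflow for large samples.

```python
import numpy as np

def scatter(z):
    """Sum of outer products of deviations from the sample mean."""
    d = z - z.mean(axis=0)
    return d.T @ d

def logdet(mat):
    """Natural log of the determinant of a positive-definite matrix."""
    return np.linalg.slogdet(mat)[1]

def log_lambda_statistics(x, y):
    """Return (ln Lambda_1, ln Lambda_2) for samples x (n, q) and y (m, q).

    Lambda_1 compares covariance matrices; Lambda_2 compares mean vectors.
    Requires n > q and m > q so that the scatter matrices are non-singular.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    big_n = n + m
    a_x, a_y = scatter(x), scatter(y)
    a = a_x + a_y
    d = x.mean(axis=0) - y.mean(axis=0)
    b = a + (m * n / big_n) * np.outer(d, d)
    log_l1 = 0.5 * (n * logdet(a_x / n) + m * logdet(a_y / m)
                    - big_n * logdet(a / big_n))
    log_l2 = 0.5 * big_n * (logdet(a) - logdet(b))
    return log_l1, log_l2
```

Both logarithms are at most zero, and values near zero favor the null hypothesis. Applying the same non-singular linear transformation to both samples leaves the two values unchanged, which is the invariance property stated above.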

Therefore we can construct a significance test of the
null hypothesis. $\Lambda_1$ and $\Lambda_2$ are independent under the null
hypothesis [12], so the procedure we use is to test $\Lambda_1$ at
significance level $\alpha_1$ and $\Lambda_2$ at level $\alpha_2$, and reject the
null hypothesis if either test produces a rejection.
(Cooley and Lohnes [13] give transformations of $\Lambda_1$ and $\Lambda_2$
(the modified versions) with F-distributions under the null
hypothesis.) The overall significance level is then
$\alpha = 1 - (1 - \alpha_1)(1 - \alpha_2)$. Essentially, $\Lambda_1$ tests the hypothesis of
equal covariance matrices (second-order statistics), and $\Lambda_2$
tests the hypothesis of equal mean vectors (first-order
statistics).
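As a worked illustration of the overall significance level, take the example values $\alpha_1 = 0.005$ and $\alpha_2 = 0.001$ (chosen here only for arithmetic convenience):

$$\alpha = 1 - (1 - 0.005)(1 - 0.001) = 1 - (0.995)(0.999) = 1 - 0.994005 = 0.005995$$

For small levels the result is close to the sum $\alpha_1 + \alpha_2 = 0.006$, since the product term $\alpha_1 \alpha_2$ is negligible.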

These multivariate (MV) tests have the same weakness as
MD classification, namely the problem of estimating a MVN
density from a relatively small sample (sometimes known as
the "dimensionality" problem). This led to the constraint
m > q, a condition which is often not met. Even when the
condition is met, poor estimates can result, leading to
decision errors. One approach to this problem is to reduce
q by deleting features. It is well known, for example, that
a subset of features used to train a classifier from small
training samples can sometimes produce better classification
results than the full set. With this approach, however, one
is faced with the problem of choosing the subset.

Another approach is to base the decision on the q
univariate, marginal distributions; i.e. simply consider the
data in one spectral channel at a time. This has been
termed a "multiple univariate" (MUV) approach. In each
channel we test the univariate hypothesis that the means and
variances of the two samples are equal. Since the
boundaries may be strong in some spectral channels and weak
in others, we accept the null hypothesis only if the
univariate hypothesis is accepted in all q channels.
Besides avoiding the dimensionality problem, the MUV
procedure requires less computation and simpler distribution
theory. However, it must be pointed out that in situations
where class separability is primarily a multivariate effect,
the MV procedure may be more advantageous.
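One way to picture the MUV decision is to apply the q = 1 versions of the two likelihood ratio statistics channel by channel, as in the Python sketch below. The paper does not spell out the exact univariate tests, so this choice is an assumption, as are the function names and the thresholds `log_t1` and `log_t2`, which stand in for critical values that would in practice be derived from the chosen significance levels.

```python
import math

def univariate_log_lambdas(x, y):
    """q = 1 versions of ln Lambda_1 (variances) and ln Lambda_2 (means).

    x, y: lists of values from one spectral channel; both must have
    non-zero sample variance.
    """
    n, m = len(x), len(y)
    big_n = n + m
    mean_x, mean_y = sum(x) / n, sum(y) / m
    a_x = sum((v - mean_x) ** 2 for v in x)
    a_y = sum((v - mean_y) ** 2 for v in y)
    a = a_x + a_y
    b = a + (m * n / big_n) * (mean_x - mean_y) ** 2
    log_l1 = 0.5 * (n * math.log(a_x / n) + m * math.log(a_y / m)
                    - big_n * math.log(a / big_n))
    log_l2 = 0.5 * big_n * math.log(a / b)
    return log_l1, log_l2

def muv_accept(x_channels, y_channels, log_t1, log_t2):
    """Accept the null hypothesis only if every channel passes both tests."""
    for x, y in zip(x_channels, y_channels):
        log_l1, log_l2 = univariate_log_lambdas(x, y)
        if log_l1 < log_t1 or log_l2 < log_t2:
            return False      # a boundary is indicated in this channel
    return True
```

A single rejecting channel is enough to block the annexation, which matches the observation that a boundary may be strong in some channels and weak in others.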

For a "supervised" approach to partitioning we take $\Omega$
to be:

$$\Omega = \{\, p(x \mid W_i) : i = 1, \ldots, K \,\}$$

This greatly simplifies each hypothesis, but paradoxically
the resultant test criterion is much more complicated:

$$\Lambda = \frac{\max_i\, p(X \mid W_i)\, p(Y \mid W_i)}{\max_i p(X \mid W_i)\ \max_j p(Y \mid W_j)} \qquad (6)$$

This is a multivariate statistic without the constraint
m > q that was necessary in the unsupervised mode. However,
the maxima in formula (6) cannot be expressed in a simple
analytic form as in (3). They can only be obtained by
exhaustive search. Furthermore, the distribution of (6) is
unknown under either hypothesis, because it depends on the
true classes of X and Y. But in return we gain a statistic
which should be more "sensitive" to the presence or absence
of a boundary. This should produce better performance and
make the specification of a decision threshold less
critical. In fact, the experimental results indicate that
the threshold need not be a function of n, the current size
of sample X, in order to obtain good results. Furthermore,
the results tend to be fairly stable over several orders of
magnitude of threshold variation. Thus we will find it
convenient to represent the decision threshold as

$$T = 10^{-t}, \qquad t \ge 0$$

In other words, we reject the null hypothesis if $\Lambda < T$ or
equivalently $-\log_{10} \Lambda > t$. Otherwise we accept it.
Experimentally we investigate the effect of different values
of t on performance.
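The supervised criterion of formula (6) reduces to a few maxima over the K classes once the per-class log likelihoods of X and Y are available. The Python sketch below assumes those are supplied as lists of natural logarithms, one entry per class; the function names and the default t = 4 (the value used in the experiments reported later) are illustrative choices.

```python
import math

def neg_log10_lambda(loglik_x, loglik_y):
    """-log10 of formula (6), from per-class natural-log likelihoods of X and Y."""
    # Numerator: best single class for X and Y taken jointly.
    joint = max(lx + ly for lx, ly in zip(loglik_x, loglik_y))
    # Denominator: best class for X and best class for Y, chosen separately.
    separate = max(loglik_x) + max(loglik_y)
    return (separate - joint) / math.log(10.0)

def annex(loglik_x, loglik_y, t=4.0):
    """Accept the null hypothesis (merge Y into X) unless -log10(Lambda) > t."""
    return neg_log10_lambda(loglik_x, loglik_y) <= t
```

The statistic is zero whenever X and Y are most likely under the same class, and grows as their best classes diverge, so the exhaustive search costs only one pass over the classes.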


Cell Selection Criterion

"Cell selection" refers to the Level-1 test which is
used to detect cells that overlap boundaries. Such cells
frequently exhibit abnormally large variances. Thus, in the
unsupervised mode, we say that a cell is singular if the
ratio of the square root of the sample variance to the
sample mean falls above some threshold, c, in any channel.
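The unsupervised Level-1 test can be sketched with the Python standard library as follows. The function name is hypothetical, positive channel means are assumed, and the sample variance is taken with the n - 1 divisor (`statistics.variance`), which the paper does not specify. The default c = 0.25 is the value reported in the experimental section.

```python
from statistics import mean, variance

def is_singular_unsupervised(cell_channels, c=0.25):
    """cell_channels: one list of pixel values per spectral channel.

    The cell is singular if the ratio (standard deviation / mean)
    exceeds c in any channel.
    """
    for values in cell_channels:
        ratio = variance(values) ** 0.5 / mean(values)
        if ratio > c:
            return True
    return False
```

For a 2 x 2 cell each channel list holds four values, so the test costs only a handful of arithmetic operations per channel.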


In the supervised mode we call a cell singular if
$Q_j(Y) > c$, where:

$$Q_j(Y) = \mathrm{tr}\left(C_j^{-1} \sum_{k=1}^{m} (Y_k - M_j)(Y_k - M_j)^T\right)$$

and where j is such that:

$$\ln p(Y \mid W_j) = \max_i \ln p(Y \mid W_i) = \max_i \left[-\tfrac{1}{2}\left(m \ln|2\pi C_i| + Q_i(Y)\right)\right]$$

The decision rule is to accept the hypothesis that Y is
homogeneous if $Q_j(Y) < c$, where c is a prespecified
threshold. Otherwise the hypothesis is rejected. This
criterion has the particular advantage that it tends to
reject not only inhomogeneous cells, but "unrecognizable"
cells as well. (Unrecognizable cells are those which
represent spectral classes that the classifier has not been
trained to recognize.) Another advantage of this criterion
is that its use of the log-likelihood function makes it
especially compatible with the supervised annexation
criterion and the ML sample classifier.

As a final note, the probability $P(Q_j(Y) > c \mid W_j)$
follows from a chi-squared distribution with mq degrees of
freedom. This can be used to provide initial guidance in
choosing c.
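A minimal Python sketch of the supervised Level-1 test is given below. NumPy is assumed, the function names are hypothetical, and `class_params` is an assumed list of (mean, covariance) pairs for the trained classes. The quadratic form is written as the trace of the inverse covariance times the scatter of the cell about the class mean.

```python
import numpy as np

def q_statistic(cell, mean, cov):
    """Q_j(Y) = tr(C_j^{-1} * sum_k (Y_k - M_j)(Y_k - M_j)^T)."""
    d = np.asarray(cell, dtype=float) - mean
    return float(np.trace(np.linalg.solve(cov, d.T @ d)))

def supervised_cell_selection(cell, class_params, c):
    """Return (j, Q_j, singular) where j is the most likely class for the cell."""
    m = len(cell)
    best = None
    for j, (mean, cov) in enumerate(class_params):
        q = q_statistic(cell, mean, cov)
        # ln p(Y | W_j) = -(1/2) (m ln|2 pi C_j| + Q_j(Y))
        log_lik = -0.5 * (m * np.linalg.slogdet(2.0 * np.pi * cov)[1] + q)
        if best is None or log_lik > best[1]:
            best = (j, log_lik, q)
    j, _, q = best
    return j, q, q > c
```

The log likelihoods computed here are the same quantities needed by the supervised annexation criterion and the ML sample classifier, so in a combined implementation they could be computed once per cell and reused.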



IV. EXPERIMENTAL RESULTS

Two aircraft and two LANDSAT-1 data sets, for which
large amounts of training and test data are available, were
classified by the following six methods:

1. Conventional ML No-Memory Classification [14]

2. Supervised Cell Selection only (t=0); ML Sample
Classification

3. "Optimized" MUV Unsupervised Partitioning; ML Sample
Classification

4. Supervised Partitioning (t=4); ML Sample Classification

5. ML Sample Classification of Test Areas Only

6. MD (Bhattacharyya) Sample Classification of Test Areas
Only [14]

The cell size for #2-#4 was fixed at 2 x 2 pixels, which is
the minimum allowed in the unsupervised mode.

A qualitative assessment of the results is provided by
Figures 2 and 3. Figure 2 (left side) shows a section of
aircraft data that has been classified by method #1. Each
class has been assigned a gray level, and each pixel has
been displayed as the gray level assigned to its
classification. A great deal of "classification noise" is
readily apparent. In contrast to this, Figure 2 (right
side) shows the same section as classified by method #4.
The random errors have, for the most part, been eliminated.
This map is much closer to the "type-map" form of
output that is generally desired.

Figure 3 shows the centers of these two maps in greater
detail. Each class is represented by an assigned symbol and
each symbol represents one pixel. The four rectangular
areas are test areas designated as wooded pasture (displayed
as a blank). The diversity of symbols in the test areas
testifies to the inadequacy of the no-memory method for
classifying this section, whereas most of the confusion is
avoided by the ECHO technique.

The estimated probability of error for each method
gives an important quantitative measure of performance. It is
obtained as the ratio of the number of misclassified pixels
in the test areas to the total number of pixels in the test
areas. Figure 4 shows results obtained for each of the four
data sets.* The results are about what one would expect.
Method #1 consistently has the highest error rate because of
its lack of use of spatial dependence. #2 uses some spatial
information and consistently does somewhat better than #1.
#3 uses more spatial information, which accounts for its
improvement over cell selection alone, and #4 does
consistently better than #3 because it uses more of the
available information in the partitioning phase.

* Each data set contains different classes from the general
categories: agriculture, forest, town, mining, and water.
Refer to reference 15 for details.

#5 and #6 usually provide the best performance, because
they are given more a priori information to begin with. One
reason for including them here is to determine if either
provides a distinct advantage over the other. On 3 of the 4
data sets, maximum likelihood sample classification achieved
lower error rates than the minimum Bhattacharyya distance
strategy. The differences are small, however. This
justifies our use of the ML strategy in #2-#4. Another
reason for including them is that the performance of #5
provides a "goal" (but not a bound) for the performance of
#3 and #4; i.e. the nearness of the performance to this goal
is an indication of the effectiveness of the partitioning
process alone.

Although #3 appears to be fairly close to #4 in
general, it must be pointed out that the "optimum"
combination of $\alpha_1$ and $\alpha_2$ which achieves this performance is
somewhat unpredictable at this time. All that we can say of
a general nature is that $\alpha_1$ tends to be effective at about
.005 and $\alpha_2$ at a smaller value such as .001 or 0.

The results for the supervised mode, however, are much
more stable. Figure 4 shows only the results for t=4, which
are not always the optimum results, but they are within 1%
of the optimum in all 4 cases. Figure 5 shows a typical
example of the effect of t on classification error rate.

The results are not a sensitive function of the Level-1
threshold, c. The values c = .25 (unsupervised mode) and
c = 15q (supervised mode, 3 ≤ q ≤ 6) usually provided the
desired effect.

The main advantage of the unsupervised mode appears to
be speed, when classification complexity is reasonably high.
This is because the time saved by classifying pixels
collectively can more than compensate for the time required
to partition. For a LANDSAT-1 data set classified with 4
channels and 14 spectral classes, processor #3 required 22%
less CPU time than #1, in spite of the fact that the
classification subroutine in #1 is coded in assembler
language for peak efficiency. (It has been estimated that
this increases its efficiency by about 50%.) #3 and #4 are
just developmental versions coded in FORTRAN. But for an
aircraft data set with 6 channels and 17 spectral classes,
#4 required 26% less time and #3 required 56% less time than
#1.


V. CONCLUSION

We have successfully exploited the redundancy of states
that is characteristic of sampled imagery of ground scenes
to achieve better accuracy and reduce the number of actual
classifications required. The only training used is the
same as that required by a conventional maximum likelihood,
no-memory classifier, i.e. estimates of the
class-conditional, marginal densities for a single pixel.
Thus we have not relied on specific spatial features,
textural information (class-conditional spatial
correlation), or on the contextual information associated
with spatial relationships of objects.

Figure 1. Basic Flow Chart for a Two-Level, Conjunctive, Partitioning Algorithm